Five Port Router for Network on Chip 


Swati Malviya (M.Tech. -Digital Communication) 
Anurag Jaiswal (Capgemini Consulting India Pvt. Ltd., 
Former HOD-IT, FMS, MITS University) 
mems_anurag@rediff.com, swati_malviya4@yahoo.co.in 


Abstract 


Multiprocessor system on chip is emerging as a new trend in system on chip (SOC) design, but wire
and power constraints are forcing the adoption of new design methodologies. Researchers have pursued a
scalable solution to this problem, namely Network on Chip (NOC).
A NOC architecture, which consists of an on-chip packet-switched network, better supports SOC
integration. The idea is borrowed from large-scale multiprocessors and the wide area network domain and
envisions a network built from on-chip routers. Cores access the network through suitable interfaces and
have their packets forwarded to the destination along a multihop routing path.

To implement a competitive NOC architecture, the router must be designed efficiently, since it
is the central component of the architecture. In this paper we implement a parallel router that can
serve five requests simultaneously. The communication bottleneck is reduced, and the speed of
communication thereby increased, by using a simple routing mechanism, flow control mechanism and
decoding logic.


1. Introduction 


A system on chip is a complex interconnection of various functional elements. Because of its bus-based
architecture, it creates a communication bottleneck in gigabit communication. There was therefore a need
for a system that exploits modularity and parallelism. Network on chip possesses many such attractive
properties and solves the problem of the communication bottleneck. It is based on the idea of
interconnecting cores through an on-chip network.


Communication on a network on chip is carried out by routers, so implementing a better NOC requires
an efficiently designed router. The router presented here supports five parallel connections at the
same time. It uses store-and-forward flow control and deterministic XY routing. The switching mechanism
is packet switching, which is generally used in networks on chip. In packet switching, data is
transferred in the form of packets between cooperating routers, and the routing decision is taken
independently at each router. Store-and-forward flow control is chosen because it does not reserve
channels and thus does not leave physical channels idle. XY routing is used for its simplicity and low
overhead. The arbiter follows a rotating priority scheme so that every channel gets a chance to
transfer its data. Both input and output buffering are used so that congestion can be avoided on both
sides.


2. Background 


The router presented here is designed to avoid congestion and communication bottlenecks. A number of
router implementations have already been reported, and some of the related work is summarised here.
Marescaux presented a router for a NOC-based system with a 2D torus network topology, XY
blocking and deterministic routing. The packet size was 16 bits plus 3 control bits. The main drawback
was that the 2D torus was formed using 1D routers, which creates a serious traffic bottleneck. Zeferino
presented a soft-core router for NOC. The problem with this implementation was that it uses a 4-flit
buffer in an 8-bit implementation, which is quite high. Its input and output channels have four distinct
blocks and use large decoding logic. Moraes also presented a router, but its drawback was that
each packet has two headers, which is quite expensive. The buffer in that design is present only at the
input channel. The absence of an output buffer is a serious problem in a router implementation because it
aggravates congestion. Our design removes most of the problems cited above and improves the
performance of the router.


3. Design 


The router consists of five ports (east, west, north, south and local) and a central crosspoint matrix.
Each port has an input channel and an output channel. A data packet enters through the input channel of
one port and is forwarded to the output channel of another port. Each input channel and output
channel has its own decoding logic, which increases the performance of the router. Buffers are present at
all ports to store data temporarily, and the flow control method is store and forward. Control
logic is present to make arbitration decisions. In this way communication is established between input
and output ports. The connection between the two is configured through the central crosspoint matrix:
the control bit lines of the crosspoint matrix are set according to the destination path of the data
packet. The movement of data from source to destination is governed by the switching mechanism. Packet
switching is used here, with a flit size of 8 bits, so the packet size varies from 8 bits to 120
bits (1 to 15 flits). A detailed explanation of the design follows.
Figure 1 Router Architecture


a. Input Channel


There is one input channel at each port, and each has its own control and decoding logic. It consists of
three main parts: FIFO, FSM and XY logic.


The FIFO is used as an input buffer to store data temporarily. The FIFO is 8 bits wide and
16 entries deep. The first 8 bits of a packet are the header, which contains the coordinates of the
destination. In this way the packet size varies from 8 bits to 120 bits. The status of the FIFO decides
whether communication can start or not. If the FIFO is empty, data can be written to it and
communication can start. If the FIFO is full, data can be read, that is, forwarded towards its
destination router. Inside the router, grant/acknowledgement signals are used to access the FIFO. The
read and write operations of the FIFO are controlled by the FSM.
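
As a rough illustration of the buffer described above, the following Python sketch models an
8-bit-wide FIFO with full and empty status flags. It is a behavioural model only, not the hardware
implementation. The class name FlitFifo and its method names are hypothetical, and the depth of 16 is
assumed to count flit entries.

```python
from collections import deque

FLIT_WIDTH = 8    # bits per flit, as stated in the text
FIFO_DEPTH = 16   # ASSUMPTION: the stated depth counts flit entries


class FlitFifo:
    """Behavioural model of a channel buffer (hypothetical names)."""

    def __init__(self, depth=FIFO_DEPTH):
        self.depth = depth
        self.slots = deque()

    @property
    def empty(self):
        return len(self.slots) == 0

    @property
    def full(self):
        return len(self.slots) == self.depth

    def write(self, flit):
        """Store one flit; return False when the FIFO has no space."""
        if self.full:
            return False
        self.slots.append(flit & ((1 << FLIT_WIDTH) - 1))  # keep 8 bits
        return True

    def read(self):
        """Remove and return the oldest flit, or None when empty."""
        return None if self.empty else self.slots.popleft()


fifo = FlitFifo()
packet = [0x12, 0xAA, 0xBB]   # header flit followed by two payload flits
for flit in packet:
    fifo.write(flit)
received = [fifo.read() for _ in range(len(packet))]
```

Flits leave the buffer in the order in which they were written, so the header flit is always read
first.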


The FSM controls the read and write operations of the FIFO according to its status. If the FIFO has
space to store data, the FSM generates an acknowledgement signal in response to a request
arriving at the input channel, and the write operation starts. If the FIFO is full and has no space left,
the write operation stops and the acknowledgement signal goes low. When the FIFO is full, the FSM
sends a request to the output channel of another port. If it receives the grant signal, the read
operation starts and continues until the grant signal goes low or the FIFO empties. The empty status of
the FIFO thus indicates the end of communication.
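
The control behaviour described above can be sketched as a small state machine. The sketch below is
one possible reading of the text, not the implemented FSM: the three state names, the function
input_fsm_step and its signal names are assumptions introduced for illustration.

```python
from enum import Enum


class State(Enum):
    # ASSUMPTION: the paper does not name the FSM states.
    WRITE = "write"       # accepting flits while the FIFO has space
    REQUEST = "request"   # FIFO full, asking an output channel for a grant
    READ = "read"         # grant received, draining the FIFO


def input_fsm_step(state, fifo_empty, fifo_full, req_in, grant):
    """Return (next_state, ack, req_out, read_en) for one step."""
    ack = req_out = read_en = False
    if state is State.WRITE:
        if fifo_full:
            # No space left: stop acknowledging and request an output channel.
            state, req_out = State.REQUEST, True
        else:
            ack = req_in  # acknowledge an incoming request, write proceeds
    elif state is State.REQUEST:
        req_out = True
        if grant:
            state, read_en = State.READ, True
    elif state is State.READ:
        if grant and not fifo_empty:
            read_en = True
        else:
            # Grant low or FIFO empty: end of communication.
            state = State.WRITE
    return state, ack, req_out, read_en


state, ack, req_out, read_en = input_fsm_step(
    State.WRITE, fifo_empty=True, fifo_full=False, req_in=True, grant=False
)
```

In this reading, the acknowledgement is raised only while the FIFO has space, and the read enable is
raised only while the grant is held and the FIFO still contains data.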


XY logic is the deterministic routing logic that analyses the header of a packet and sends it to its
destination port. The first four bits of the header are the coordinates of the destination. A
comparator compares the header coordinates with the locally stored X and Y coordinates and
forwards the packet accordingly. Let the coordinates stored in the header be Hx and Hy
and the locally stored coordinates be X and Y. If Hx > X the packet moves to the east port, and if
Hx < X it moves to the west port. If Hx = X the Y coordinates are compared: if Hy > Y the packet
moves to the north port, if Hy < Y it moves to the south port, and if Hy = Y it moves to the local port.
In this way XY logic sends the packet to the output channel of its destination port.
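
The comparison rules above can be written as a short function. This is a minimal sketch: the function
names are hypothetical, and the header layout in decode_header (Hx in bits 3..2, Hy in bits 1..0) is an
assumption, since the text only states that four header bits carry the coordinates.

```python
def decode_header(header):
    """Split a header flit into (hx, hy).

    ASSUMPTION: Hx occupies bits 3..2 and Hy bits 1..0 of the header;
    the paper does not specify the bit layout.
    """
    return (header >> 2) & 0b11, header & 0b11


def xy_route(hx, hy, x, y):
    """Return the output port for a packet at router (x, y)."""
    if hx > x:
        return "east"
    if hx < x:
        return "west"
    # X coordinates match, so the Y coordinates decide.
    if hy > y:
        return "north"
    if hy < y:
        return "south"
    return "local"


hx, hy = decode_header(0b1001)        # hx = 2, hy = 1
port = xy_route(hx, hy, x=1, y=1)     # destination lies to the east
```

Because the X coordinate is always resolved before the Y coordinate, every packet between the same
pair of routers follows the same path, which is what makes the routing deterministic.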
Figure 2 Input Channel


b. Output Channel 


Each port of the router contains one output channel, which has its own control and decoding logic. It
also consists of three parts: FIFO, FSM and arbiter. The FIFO and FSM are the same as in the input
channel, but an arbiter takes the place of the XY logic.


The FIFO in the output channel is used as an output buffer to store data temporarily. It is 8 bits wide
and 16 entries deep. The first 8 bits of a packet are the header, which contains the coordinates of the
destination router. The packet size thus varies from 8 bits to 120 bits. The status of the FIFO decides
whether communication can start or not. If the FIFO is empty, data can be written and communication can
start. If the FIFO is full, data can be read, that is, forwarded towards its destination router. Inside
the router, grant/acknowledgement signals are used to access the FIFO. The read and write operations of
the FIFO are controlled by the FSM.


The FSM controls the read and write operations of the FIFO according to its status. If the FIFO is empty
or has enough space to store the data, the FSM gives an acknowledgement signal in response to the request
coming from an input channel, and the write operation starts. If the FIFO is full or does not have
enough space to store the data, the write operation terminates and the acknowledgement signal goes low.
When the FIFO is full, the FSM sends a request to the other router. If it receives the grant signal, the
read operation starts and continues until the grant goes low or the FIFO empties.


The arbiter occupies the place in the output channel that the XY logic has in the input channel. It
resolves multiple requests arriving at a single output port. When more than one request
arrives from different input channels at a single output channel, the arbiter selects one of the
requests and serves it. The arbiter uses a rotating priority scheme in which east has the highest
priority, followed by west, north, south and then the local port. Because the priority rotates, the
priority of a port is reduced once it has been served.
This scheme increases the performance of the router, as each port gets a chance to send its data.
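
A rotating priority arbiter of this kind can be sketched as follows. The class and method names are
hypothetical. The sketch assumes that a served port drops to the lowest priority, which is one common
way to realise the rotation described above.

```python
PORTS = ["east", "west", "north", "south", "local"]  # initial priority order


class RotatingArbiter:
    """Grant one requesting port at a time (hypothetical names)."""

    def __init__(self):
        self.order = list(PORTS)  # highest priority first

    def grant(self, requests):
        """Return the granted port, or None when nothing is requested."""
        for port in self.order:
            if port in requests:
                # ASSUMPTION: the served port moves to the lowest priority.
                self.order.remove(port)
                self.order.append(port)
                return port
        return None


arbiter = RotatingArbiter()
first = arbiter.grant({"east", "north"})    # east wins on initial priority
second = arbiter.grant({"east", "north"})   # east was just served, north wins
```

With this policy a port that keeps requesting cannot be starved, because every port it loses to is
moved behind it in the priority order.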
Figure 3 Output Channel


c. Crosspoint Matrix 


The crosspoint matrix is the central component of the router. It consists of
multiplexers and demultiplexers that interconnect all 5 input channels and 5 output channels. It
allows all five connections at the same time by configuring the input and output channels. The
crosspoint matrix is controlled by the output channels: when granting the request of an input channel,
an output channel configures the corresponding multiplexers and demultiplexers.


Since each input channel has its own FSM and XY logic and each output channel has its own FSM and
arbiter, there is no time lag in making a connection, and all five requests can be granted
simultaneously, which increases the performance of the router.
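
The behaviour of the crosspoint matrix can be illustrated with a small model in which each output
channel selects one input channel. The function name crossbar_transfer and the dictionary-based
representation are assumptions made for this sketch. It also assumes that an input channel drives at
most one output at a time.

```python
PORTS = ("east", "west", "north", "south", "local")


def crossbar_transfer(select, input_flits):
    """Forward one flit per configured connection.

    select maps each output port to the input port it has granted;
    input_flits maps each input port to the flit at the head of its FIFO.
    """
    sources = list(select.values())
    if len(sources) != len(set(sources)):
        # ASSUMPTION: an input channel drives only one output at a time.
        raise ValueError("input channel selected by more than one output")
    return {out: input_flits[src] for out, src in select.items()}


# Five connections configured at the same time, one per output port.
select = {"west": "east", "north": "west", "south": "north",
          "local": "south", "east": "local"}
input_flits = {"east": 0x11, "west": 0x22, "north": 0x33,
               "south": 0x44, "local": 0x55}
outputs = crossbar_transfer(select, input_flits)
```

Each entry of select corresponds to the multiplexer setting that an output channel applies when it
grants a request, so five entries model five simultaneous connections.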
Figure 4 Crossbar Switch


4. Conclusion


We have proposed a router that supports five connections at the same time without any
communication bottleneck. We have used simple decoding logic, store-and-forward flow control,
packet switching, deterministic XY routing, and input and output buffering, which together increase the
performance of the router. In future we intend to build an advanced router prototype supporting a
high-level protocol (HLP), with a flit size of 16 bits and all 8 bits of the header in use.